robust deep reinforcement learning
Robust Deep Reinforcement Learning through Adversarial Loss
Recent studies have shown that deep reinforcement learning agents are vulnerable to small adversarial perturbations on the agent's inputs, which raises concerns about deploying such agents in the real world. To address this issue, we propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against $l_p$-norm bounded adversarial attacks. Our framework is compatible with popular deep reinforcement learning algorithms, and we demonstrate its performance with deep Q-learning, A3C and PPO. We experiment on three deep RL benchmarks (Atari, MuJoCo and ProcGen) to show the effectiveness of our robust training algorithm. Our RADIAL-RL agents consistently outperform prior methods when tested against attacks of varying strength and are more computationally efficient to train. In addition, we propose a new evaluation method called Greedy-Worst-Case Reward (GWC) to measure attack-agnostic robustness of deep RL agents. We show that GWC can be evaluated efficiently and is a good estimate of the reward under the worst possible sequence of adversarial attacks.
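For intuition, here is a minimal sketch of how a certified-bound adversarial loss in this spirit could be attached to DQN training, using interval bound propagation (IBP) over a small fully connected network. The network layout, names, and exact penalty term are illustrative assumptions, not the paper's released implementation: the penalty pushes every non-chosen action's upper Q-bound below the chosen action's lower Q-bound, so that no ε-bounded perturbation can flip the greedy action.

```python
# Hypothetical sketch (not the paper's code): an l_inf-bounded adversarial
# loss for DQN via interval bound propagation over Linear/ReLU layers.
import torch
import torch.nn as nn
import torch.nn.functional as F

def ibp_bounds(layers, s, eps):
    """Propagate the interval [s - eps, s + eps] through the network."""
    lo, hi = s - eps, s + eps
    for layer in layers:
        if isinstance(layer, nn.Linear):
            mid, rad = (lo + hi) / 2, (hi - lo) / 2
            mid = layer(mid)                      # W @ mid + b
            rad = rad @ layer.weight.abs().t()    # |W| @ rad
            lo, hi = mid - rad, mid + rad
        else:                                     # ReLU is monotone
            lo, hi = F.relu(lo), F.relu(hi)
    return lo, hi

def radial_style_loss(layers, s, a, td_target, eps):
    """Standard TD loss plus a bound-overlap penalty; a: (batch, 1) actions."""
    q = s
    for layer in layers:
        q = layer(q)
    td_loss = F.smooth_l1_loss(q.gather(1, a), td_target)
    lo, hi = ibp_bounds(layers, s, eps)
    q_lo_a = lo.gather(1, a)                      # lower bound, chosen action
    overlap = F.relu(hi - q_lo_a)                 # margin violation per action
    overlap = overlap.scatter(1, a, 0.0)          # ignore the chosen action
    return td_loss + overlap.sum(dim=1).mean()

# Example network: layers = [nn.Linear(obs_dim, 64), nn.ReLU(),
#                            nn.Linear(64, n_actions)]
```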
Review for NeurIPS paper: Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations
Clarity: *** Derivations in Section 3 *** While the theorems across Section 3.1 seem reasonable, I would have liked a more self-contained presentation of the theorems together with their proofs. Assumption 2 (bounded adversary power) is a bit strange; while the experimental implementation (a norm ball around s) seems reasonable for many environments, the assumption should probably be defined more precisely, for instance as sketched below. The authors refer to the Appendix a lot, and in my opinion such derivations are necessary for the reader to follow along; as written, I cannot really follow how the authors arrive at their results. Please also add plots (similar to Appendix I, Figure 12).
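For concreteness, one self-contained way Assumption 2 could be stated (my sketch of the direction I am suggesting, not the authors' wording):

```latex
% Reviewer's sketch of a more precise statement of Assumption 2.
\begin{assumption}[Bounded adversary power]
The adversary is a map $\nu : \mathcal{S} \to \mathcal{S}$ such that
$\nu(s) \in B_\epsilon(s)$ for all $s \in \mathcal{S}$, where
$B_\epsilon(s) = \{\hat{s} \in \mathcal{S} : \|\hat{s} - s\|_p \le \epsilon\}$
is the $l_p$ ball of radius $\epsilon$ around the true state $s$.
\end{assumption}
```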
Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations
A deep reinforcement learning (DRL) agent observes its states through observations, which may contain natural measurement errors or adversarial noise. Several works have shown this vulnerability via adversarial attacks, but how to improve the robustness of DRL under this setting has not been well studied. We show that naively applying existing techniques for improving the robustness of classification tasks, such as adversarial training, is ineffective for many RL tasks. We propose the state-adversarial Markov decision process (SA-MDP) to study the fundamental properties of this problem, and develop a theoretically principled policy regularization which can be applied to a large family of DRL algorithms, including deep deterministic policy gradient (DDPG), proximal policy optimization (PPO) and deep Q networks (DQN), for both discrete and continuous action control problems. We significantly improve the robustness of DDPG, PPO and DQN agents under a suite of strong white-box adversarial attacks, including two new attacks of our own.
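As a rough illustration of what such a policy regularization can look like in practice, below is a hedged PyTorch sketch of a smoothness term for a discrete-action policy: a few projected-gradient steps search the ε-ball for the perturbed state that maximizes divergence from the clean policy, and that divergence (scaled by a coefficient) is added to the training loss. The function names, PGD settings, and choice of KL direction are assumptions, not the paper's code.

```python
# Hypothetical sketch of a state-adversarial smoothness regularizer:
# KL(pi(.|s) || pi(.|s + delta)), maximized over ||delta||_inf <= eps.
import torch
import torch.nn.functional as F

def smoothness_regularizer(policy, s, eps, steps=3, step_size=0.1):
    with torch.no_grad():
        logp_clean = F.log_softmax(policy(s), dim=-1)   # clean policy, fixed
    delta = torch.zeros_like(s, requires_grad=True)
    for _ in range(steps):                              # PGD on the KL objective
        logp_adv = F.log_softmax(policy(s + delta), dim=-1)
        kl = F.kl_div(logp_adv, logp_clean, log_target=True,
                      reduction="batchmean")
        grad, = torch.autograd.grad(kl, delta)          # grad w.r.t. delta only
        with torch.no_grad():
            delta += step_size * eps * grad.sign()      # ascent step
            delta.clamp_(-eps, eps)                     # project to l_inf ball
    logp_adv = F.log_softmax(policy(s + delta.detach()), dim=-1)
    # Scale by a coefficient kappa and add to the RL loss.
    return F.kl_div(logp_adv, logp_clean, log_target=True,
                    reduction="batchmean")
```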
Robust Deep Reinforcement Learning for Inverter-based Volt-Var Control in Partially Observable Distribution Networks
Inverter-based volt-var control is studied in this paper. One key issue in DRL-based approaches is the limited measurement deployment in active distribution networks, which leads to a partially observable state and an unknown reward. To address those problems, this paper proposes a robust DRL approach with a conservative critic and a surrogate reward. The conservative critic uses quantile regression to estimate a conservative state-action value function from the partially observable state, which helps train a robust policy; surrogate rewards for power loss and voltage violation are designed so that they can be calculated from the limited measurements. The proposed approach optimizes the power loss of the whole network and the voltage profile of buses with measurable voltages while indirectly improving the voltage profile of other buses. Extensive simulations verify the effectiveness of the robust DRL approach under different limited measurement conditions, even when only the active power injection of the root bus and less than 10% of bus voltages are measurable.
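As a sketch of the conservative-critic idea, the snippet below shows a QR-DQN-style quantile critic trained with the pinball loss, where evaluating the mean of the lowest quantiles rather than the full mean yields a pessimistic value estimate. Shapes, names, and the bottom-k rule are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch: a conservative critic via quantile regression.
import torch

def quantile_loss(pred_quantiles, target, taus):
    """Pinball loss. pred_quantiles: (batch, N); target: (batch, 1); taus: (N,)."""
    u = target - pred_quantiles                          # residuals
    # tau * u if u >= 0 else (tau - 1) * u, averaged over batch and quantiles
    return torch.where(u >= 0, taus * u, (taus - 1) * u).mean()

def conservative_value(pred_quantiles, k):
    """Average the k lowest quantile estimates instead of the full mean."""
    low, _ = pred_quantiles.topk(k, dim=-1, largest=False)
    return low.mean(dim=-1)

N = 32
taus = (torch.arange(N, dtype=torch.float32) + 0.5) / N  # quantile midpoints
```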
A Multi-Task Approach to Robust Deep Reinforcement Learning for Resource Allocation
Gracla, Steffen, Bockelmann, Carsten, Dekorsy, Armin
With the increasing complexity of modern communication systems, machine learning algorithms have become a focal point of research. However, performance demands have tightened alongside this complexity. For some of the key applications targeted by future wireless systems, such as the medical field, strict and reliable performance guarantees are essential, but vanilla machine learning methods have been shown to struggle with these types of requirements. This raises the question of whether these methods can be extended to better meet the demands imposed by such applications. In this paper, we look at a combinatorial resource allocation challenge with rare but significant events that must be handled properly. We propose to treat this as a multi-task learning problem, select two methods from this domain, Elastic Weight Consolidation and Gradient Episodic Memory, and integrate them into a vanilla actor-critic scheduler. We compare their performance in handling Black Swan Events against the state-of-the-art approach of augmenting the training data distribution, and report that the multi-task approach proves highly effective.
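For reference, Elastic Weight Consolidation reduces to a simple quadratic penalty on parameter drift; the sketch below shows how such a term might be added to an actor-critic loss. The diagonal Fisher estimates and anchor parameters from the previous task are assumed to be given, and none of this is the authors' code.

```python
# Hypothetical sketch of an EWC penalty added to an actor-critic loss.
import torch

def ewc_penalty(model, fisher, anchor, lam):
    """Quadratic penalty keeping parameters close to values that mattered
    for the previous task, weighted by diagonal Fisher information.
    fisher/anchor: dicts keyed by parameter name."""
    loss = 0.0
    for name, p in model.named_parameters():
        loss = loss + (fisher[name] * (p - anchor[name]) ** 2).sum()
    return 0.5 * lam * loss

# Usage: total_loss = actor_critic_loss + ewc_penalty(model, fisher,
#                                                     anchor, lam=100.0)
```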
Robust Deep Reinforcement Learning through Bootstrapped Opportunistic Curriculum
Wu, Junlin, Vorobeychik, Yevgeniy
Despite considerable advances in deep reinforcement learning, it has been shown to be highly vulnerable to adversarial perturbations to state observations. Recent efforts to improve the adversarial robustness of reinforcement learning nevertheless tolerate only very small perturbations and remain fragile as perturbation size increases. We propose Bootstrapped Opportunistic Adversarial Curriculum Learning (BCL), a novel flexible adversarial curriculum learning framework for robust reinforcement learning. Our framework combines two ideas: conservatively bootstrapping each curriculum phase with the highest-quality solutions obtained from multiple runs of the previous phase, and opportunistically skipping forward in the curriculum. In our experiments we show that the proposed BCL framework enables dramatic improvements in the robustness of learned policies to adversarial perturbations. The greatest improvement is for Pong, where our framework yields robustness to perturbations of up to 25/255; in contrast, the best existing approach can only tolerate adversarial noise up to 5/255. Our code is available at: https://github.com/jlwu002/BCL.
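The curriculum logic can be summarized in a few lines; the following is a hedged sketch of the two ideas (conservative bootstrapping and opportunistic skipping), with all function names and the reward threshold as assumptions. The authors' actual implementation is in the linked repository.

```python
# Hypothetical sketch of a bootstrapped, opportunistic adversarial curriculum.
def bcl_sketch(train, evaluate, eps_schedule, n_runs, reward_threshold):
    """train(init, eps) -> policy; evaluate(policy, eps) -> robust reward."""
    best = None                                    # best policy so far
    i = 0
    while i < len(eps_schedule):
        eps = eps_schedule[i]
        # Conservative bootstrapping: several runs from the same best start,
        # keeping only the highest-quality result.
        runs = [train(init=best, eps=eps) for _ in range(n_runs)]
        best = max(runs, key=lambda pi: evaluate(pi, eps))
        # Opportunistic skipping: jump past perturbation levels the current
        # best policy already tolerates.
        i += 1
        while (i < len(eps_schedule)
               and evaluate(best, eps_schedule[i]) >= reward_threshold):
            i += 1
    return best
```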
Robust Deep Reinforcement Learning for Extractive Legal Summarization
Nguyen, Duy-Hung, Nguyen, Bao-Sinh, Nghiem, Nguyen Viet Dung, Le, Dung Tien, Khatun, Mim Amina, Nguyen, Minh-Tien, Le, Hung
Automatic summarization of legal texts is an important yet still challenging task, since legal documents are often long and complicated, with unusual structures and styles. Recent deep models trained end-to-end with differentiable losses summarize natural text well, yet when applied to the legal domain they show limited results. In this paper, we propose to use reinforcement learning to train current deep summarization models to improve their performance in the legal domain. To this end, we adopt proximal policy optimization methods and introduce novel reward functions that encourage the generation of candidate summaries satisfying both lexical and semantic criteria. We apply our method to training different summarization backbones and observe a consistent and significant performance gain across three public legal datasets.
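A reward of the kind the abstract describes might combine a lexical overlap score with an embedding-based semantic similarity; the sketch below is one plausible instantiation using the rouge-score and sentence-transformers packages, with the specific metrics, model, and mixing weight as assumptions rather than the paper's actual reward functions.

```python
# Hypothetical sketch of a lexical + semantic summary reward for PPO.
from rouge_score import rouge_scorer
from sentence_transformers import SentenceTransformer, util

_rouge = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
_encoder = SentenceTransformer("all-MiniLM-L6-v2")

def summary_reward(candidate: str, reference: str, alpha: float = 0.5) -> float:
    # Lexical criterion: ROUGE-L F1 against the reference summary.
    lexical = _rouge.score(reference, candidate)["rougeL"].fmeasure
    # Semantic criterion: cosine similarity of sentence embeddings.
    emb = _encoder.encode([candidate, reference], convert_to_tensor=True)
    semantic = util.cos_sim(emb[0], emb[1]).item()
    return alpha * lexical + (1.0 - alpha) * semantic
```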